Relaxing instance boundaries for the search of splitting points of numerical attributes in classification trees

Authors

  • Ester Yen
  • I-Wen Mike Chu
Abstract

We propose a simple heuristic partition method (HPM) for classification trees to improve efficiency in the search for splitting points of numerical attributes. The proposal is motivated by the idea that the selection of candidate splitting points can be made more flexible so as to achieve faster computation while retaining classification accuracy. We compare the performance of HPM against Fayyad’s method, since the latter improves on the standard C4.5 algorithm in the search for splitting points. We demonstrate that HPM is more efficient, in some cases by as much as 50%, while producing essentially the same classification for six different data sets. Our results support the relaxation of instance boundaries (RIB) as a valid approach that can be explored to achieve more efficient computation. © 2006 Elsevier Inc. All rights reserved.
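To make the notion of candidate splitting points concrete, the following minimal sketch contrasts an exhaustive search over all midpoints with a search restricted to class-boundary midpoints, which is the kind of restriction associated with Fayyad's method. It does not implement HPM, whose details are not given in the abstract. The function names, the toy data, and the use of weighted entropy as the split criterion are assumptions made for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def all_midpoints(values):
    """Every midpoint between adjacent distinct attribute values."""
    xs = sorted(set(values))
    return [(a + b) / 2 for a, b in zip(xs, xs[1:])]


def boundary_midpoints(values, labels):
    """Midpoints between adjacent distinct values, skipping those where
    both neighbouring values carry one and the same single class."""
    classes_at = {}
    for v, y in zip(values, labels):
        classes_at.setdefault(v, set()).add(y)
    xs = sorted(classes_at)
    cuts = []
    for a, b in zip(xs, xs[1:]):
        ca, cb = classes_at[a], classes_at[b]
        if len(ca) == 1 and ca == cb:
            continue  # same pure class on both sides: not a boundary
        cuts.append((a + b) / 2)
    return cuts


def best_cut(values, labels, candidates):
    """Candidate threshold with the lowest weighted child entropy."""
    n = len(values)

    def cost(t):
        left = [y for v, y in zip(values, labels) if v <= t]
        right = [y for v, y in zip(values, labels) if v > t]
        return (len(left) * entropy(left) + len(right) * entropy(right)) / n

    return min(candidates, key=cost)


# Hypothetical toy attribute with two classes.
values = [1, 2, 3, 4, 5, 6, 7, 8]
labels = ["a", "a", "a", "a", "b", "b", "b", "a"]

exhaustive = all_midpoints(values)
boundary = boundary_midpoints(values, labels)
print(len(exhaustive), "exhaustive candidates;", len(boundary), "boundary candidates")
print(best_cut(values, labels, exhaustive), best_cut(values, labels, boundary))
```

On this toy data both searches select the same threshold, but the boundary-restricted search evaluates fewer candidates. Relaxing the candidate set further, as the abstract proposes, trades this exactness guarantee for speed.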


Similar resources

A New Algorithm for Optimization of Fuzzy Decision Tree in Data Mining

Decision-tree algorithms provide one of the most popular methodologies for symbolic knowledge acquisition. The resulting knowledge, a symbolic decision tree along with a simple inference mechanism, has been praised for comprehensibility. The most comprehensible decision trees have been designed for perfect symbolic data. Classical crisp decision trees (DT) are widely applied to classification t...


Support Vector Machine Based Facies Classification Using Seismic Attributes in an Oil Field of Iran

Seismic facies analysis (SFA) aims to classify similar seismic traces based on amplitude, phase, frequency, and other seismic attributes. SFA has proven useful in interpreting seismic data, allowing significant information on subsurface geological structures to be extracted. While facies analysis has been widely investigated through unsupervised-classification-based studies, there are few cases...


General and Efficient Multisplitting of Numerical Attributes

Often in supervised learning numerical attributes require special treatment and do not fit the learning scheme as well as one could hope. Nevertheless, they are common in practical tasks and, therefore, need to be taken into account. We characterize the well-behavedness of an evaluation function, a property that guarantees the optimal multi-partition of an arbitrary numerical domain to be defined ...


Integrated JIT Lot-Splitting Model with Setup Time Reduction for Different Delivery Policy using PSO Algorithm

This article develops an integrated JIT lot-splitting model for a single supplier and a single buyer. In this model we consider reduction of setup time, and the optimal lot size is obtained from the reduced setup time in the context of joint optimization for both buyer and supplier, under deterministic conditions with a single product. Two cases are discussed: Single Delivery (SD) case, and Multi...


A note on split selection bias in classification trees

A common approach to split selection in classification trees is to search through all possible splits generated by predictor variables. A splitting criterion is then used to evaluate those splits and the one with the largest criterion value is usually chosen to actually channel samples into corresponding subnodes. However, this greedy method is biased in variable selection when the numbers of t...



Journal:
  • Inf. Sci.

Volume 177  Issue 

Pages  -

Publication date 2007